New name for hint::black_box: pretend_used #74853

jonhoo · 2020-07-28T01:12:49Z

EDIT (2020-07-29): Changed from assume_used to pretend_used following #74853 (comment)

The current name is a legacy from before RFC 2360. The RFC calls the
method bench_black_box, though acknowledges that this name is not be
ideal either.

The main objection to the name during and since the RFC is that it is
not truly a black box to the compiler. Instead, the hint only encourages
the compiler to consider the passed-in value used. This in the hope that
the compiler will materialize that value somewhere, such as in memory or
in a register, and not eliminate the input as dead code.

This PR proposes to rename the method to pretend_used. This clearly
indicates the precise semantics the hint conveys to the compiler,
without suggesting that it disables further compiler optimizations. The
name also reads straightforwardly in code: hint::pretend_used(x) hints
to the compiler that it should pretend that x is used in some
arbitrary way.

pretend_used also rectifies the secondary naming concern the RFC
raised with the names black_box and bench_black_box; the potential
confusion with "boxed" values.

If this change lands, it completes the naming portion of #64102.

rust-highfive · 2020-07-28T01:12:52Z

r? @shepmaster

(rust_highfive has picked a reviewer for you, use r? to override)

jonhoo · 2020-07-28T01:14:55Z

I'd like to make it clear that I opened this PR primarily as a way to direct conversation from #64102 (comment) for a concrete change. If folks don't like the proposed name, I am happy to close this and continue discussion in #64102.

bors · 2020-07-28T03:20:19Z

☔ The latest upstream changes (presumably #73265) made this pull request unmergeable. Please resolve the merge conflicts.

bjorn3 · 2020-07-28T07:50:11Z

assume_used feels to me like the compiler is allowed to assume that the input and output are identical, allowing const folding of the user of the output value, while only forbiding the removal of the creator of the input value.

LukasKalbertodt · 2020-07-28T08:25:52Z

I like the assume_ part, but agree with @bjorn3 in that assume_used seems only fitting for the "eat the result of my calculation" use case, but not for the "hide this constant from the compiler" use case.

petertodd · 2020-07-28T10:24:08Z

Wikipedia describes black boxes as having an opaque implementation, so one option might be simply opaque()

jonhoo · 2020-07-28T12:44:08Z

assume_used feels to me like the compiler is allowed to assume that the input and output are identical, allowing const folding of the user of the output value, while only forbiding the removal of the creator of the input value.

The intention (though I agree the name doesn't quite communicate this) was that the compiler should assume that assume_used uses the provided input value in any legal fashion. That could include taking references to it, mutating it, or anything else it could think of. So constant folding across it should not be okay. I guess it depends on what you take "use" to mean. To me, using a value includes potentially modifying it. Can we think of another word than "use" to indicate that the function does arbitrary (legal) things to the passed-in value?

I like the assume_ part, but agree with @bjorn3 in that assume_used seems only fitting for the "eat the result of my calculation" use case, but not for the "hide this constant from the compiler" use case.

I think my take here is, as above, that the compiler can't constant-fold across assume_used, because it should assume it does not know how the value is used (e.g., the use might include modifications). I agree a different word than "use" may be appropriate, though none immediately come to mind. assume_arbitrary_impl maybe, though that reads much more awkwardly?

Wikipedia describes black boxes as having an opaque implementation, so one option might be simply opaque()

That would work if the body was actually opaque to the compiler, but I don't think that's actually the case. The compiler does still get some insight into what goes on, as per rust-lang/rfcs#2360 (comment), and so calling it opaque may give the reader expectations that are not true.

jonhoo · 2020-07-28T12:48:15Z

@tmiasko I'll respond to your comment over here.

I wonder how would it go along with other assume-like functions in
the standard library. There is assume intrinsic and a family of assume_init
functions, for those it is responsibility of the caller to guarantee the thing
being assumed. In the proposed assume_used, the role of the callee.

Actually, I think the existing assume_ methods set a good precedent for the use of assume. There, like here, we are telling the compiler to assume that something is true from this point forward. For me, the responsibility on the caller comes from those methods being unsafe, not from the assume in the name.

petertodd · 2020-07-29T08:46:12Z

@jonhoo Good point. Another suggestion: pretend_used

As in, we're telling the compiler to pretend that value is used to calculate a new result, even though it isn't.

jonhoo · 2020-07-29T12:36:31Z

Oooh, I quite like pretend_used. We don't have precedence for the use of "pretend" in std (unlike assume), but I don't mind it. If you (and @bjorn3) both prefer pretend_used to assume_used, I'm happy to update the PR.

The current name is a legacy from before RFC 2360. The RFC calls the method `bench_black_box`, though acknowledges that this name is not be ideal either. The main objection to the name during and since the RFC is that it is not truly a black box to the compiler. Instead, the hint only encourages the compiler to consider the passed-in value used. This in the hope that the compiler will materialize that value somewhere, such as in memory or in a register, and not eliminate the input as dead code. This PR proposes to rename the method to `pretend_used`. This clearly indicates the precise semantics the hint conveys to the compiler, without suggesting that it disables further compiler optimizations. The name also reads straightforwardly in code: `hint::pretend_used(x)` hints to the compiler that it should pretend that `x` is used in some arbitrary way. `pretend_used` also rectifies the secondary naming concern the RFC raised with the names `black_box` and `bench_black_box`; the potential confusion with "boxed" values. If this change lands, it completes the naming portion of rust-lang#64102.

jonhoo · 2020-07-29T13:04:48Z

PR updated 👍

petertodd · 2020-07-29T13:07:08Z

Oooh, I quite like pretend_used. We don't have precedence for the use of "pretend" in std (unlike assume), but I don't mind it. If you (and @bjorn3) both prefer pretend_used to assume_used, I'm happy to update the PR.

Re: lack of precedence, I actually think that's an advantage: "assume" is used in production code to enable optimizations by assuming something that is true, is true; "pretend" will be used in testing code to disable optimizations by pretending that something which is not actually true, is true.

Suggested doc text:

"An identity function that asks the compiler to pretend that a value is used in an arbitrary computation, usually to disable optimizations in benchmarking code."

jonhoo · 2020-07-29T20:00:17Z

@LukasKalbertodt and @CryZe: your emoji reactions suggest that you did not like assume_used. How do you feel about the newly proposed pretend_used? If you do not like that either, it would be helpful if you could describe what makes you feel negatively about it so we can see if we can come up with a better name :)

CryZe · 2020-07-29T20:06:34Z

pretend is a lot better but used still seems really ambiguous. Functions like assume_init imply that an assumption about something that is already true outside of the call holds. So in that specific case you need to already have initialized it and the function only tells the compiler about it. This is not the case with assume / pretend used, as in this case it's really something that the function is actively doing, that isn't already the case without the function call. So while pretend is a bit better, the whole function name still seems somewhat off too me, including the doc comment that does not clarify this all too much either.

What about any (combination) of these, which all seem a lot less ambiguous (though maybe some may imply too much?):

keep_around
dont_optimize_away
has_relevant_side_effects
keep_this

jonhoo · 2020-07-29T20:47:54Z

Yeah, the problem we're grappling with is that this method really promises (or rather, hints) quite little. And exactly what the compiler is likely to do as a result is very compiler-dependent. In reality, all the method does is tell the compiler "consider this value as used in every conceivable, but legal way". No more, no less. What that means is up to the compiler. That's why I decided to propose the _used names — they get at what the hint actually is, as opposed to what the compiler might do with that hint.

To use your proposals as examples:

keep_around: this is a little too weak — the hint should tell the compiler to not apply constant folding across the call (since the call may change the value), but this name does not imply that.
dont_optimize_away: a similar issue to the above — it implies that all the hint does is tell the compiler not to remove the value, but in practice the hint is (very slightly) stronger than that.
has_relevant_side_effects: this doesn't seem quite right to me — this might confuse a reader into thinking that the function itself has side effect, when it does not. or alternatively this might read as "assume that the given expression has side-effects, rather than the hint function having side-effects, which would imply that the expression cannot be optimized. but that is not the case.
keep_this: this also implies (to me) that the expression passed to keep_this may not be optimized, whereas that is not the case. it also shares the same issues as above around constant folding.

This is not the case with assume / pretend used, as in this case it's really something that the function is actively doing, that isn't already the case without the function call.

I'm not quite sure I follow this — assume_init, at least as I read it, is telling the compiler to assume that the given value is now initialized. It's the unsafe part that is the caller agreeing to a contract. In the same way that assume/pretend here is a directive to the compiler that something is true. In this case, the thing we're telling the compiler is true is that the passed-in value is being used (even though it doesn't look like it is).

bjorn3 · 2020-07-29T20:53:27Z

assume is normally used to allow the compiler to perform extra optimizations by assuming that something is true. pretend here is used to forbid the compiler from performing certain optimizations by pretending that something is not true. They are eachothers counterpart.

CryZe · 2020-07-29T20:54:27Z

src/test/ui/sanitize/memory.rs

 use std::mem::MaybeUninit;

 #[inline(never)]
 #[no_mangle]
 fn random() -> [isize; 32] {
    let r = unsafe { MaybeUninit::uninit().assume_init() };
    // Avoid optimizing everything out.
-    black_box(r)
+    pretend_used(r)


pretend_use / blackbox / ... is the completely wrong function to call here right? It doesn't help with correctness and this whole code relies on the black box hiding the fact that it's UB to read uninitialized memory from the compiler. (Same in the address sanitizer test)

So, I think what's going on here is actually that we want the code to be incorrect, but without pretend_used (as the comment implies) the r would be entirely optimized out by the compiler. pretend_used here causes the compiler to leave r in, which in turn means that the undefined behavior can be detected by the sanitizer.

CryZe · 2020-07-29T21:23:00Z

Okay, I believe at this point I'm even more confused than before. My understanding of this function is that it's an identity function, but this is intentionally hidden from the compiler (though not guaranteed) to prevent the compiler from knowing it's an identity function, so the compiler can't assume anything about the value anymore, meaning that no constant folding can happen and any other assumptions the compiler made about the value are completely lost.

pretend_used / assume_used seem to completely distract from this, blackbox does a much better job at conveying that. Or what exactly am I missing?

If it is (and it apparently is?) more subtle than that, then the current documentation (and name) do an incredibly bad job of conveying what kind of optimizations exactly are and are not posssible, i.e. using a value, what does that even mean to the user of this API, as someone who doesn't even know about internals of compiler optimizations.

jonhoo · 2020-07-29T21:40:52Z

It definitely is a bit confusing.

Part of the problem is that the actual semantics of the method really are "pretend that the program may have any legal Rust impl for this function". That does include things like constant folding (since it might return a different value than was provided) and dead code elimination (since the method could have side-effects), but it is not truly a black box in the way black box is normally defined in other contexts.

@hanna-kruppe had a good summary over in #64102 (comment), and there's also @cramertj's original comment over in rust-lang/rfcs#2360 (comment). The concern is specifically, assuming I've understood them correctly, that if it is called black_box, people may start to rely on it actually having black-box semantics for correctness purposes. But since this is only a hint to the compiler, that is is under no obligation to follow, those assumptions that unsafe code may rely on aren't guaranteed. A name that tries to not claim to actually be a black box, but instead at what the hint is trying to tell the compiler, means that the function is less likely to be used for correctness.

I completely agree with you that the current documentation is inaccurate. I tried to make it slightly better in this PR, but it's difficult because the "meaning" of this function to the end user is so ill-defined. I definitely do not think the updated docs in this PR are sufficient, they were only meant to be a starting point for making progress on #64102.

LukasKalbertodt · 2020-07-30T19:55:24Z

@jonhoo The prefix pretend_ is great, as many already mentioned. But just like @CryZe, I'm still not convinced by _used, unfortunately. While I know that "use" is a pretty general word, I personally intuitively associate "used" with "read-only". So pretend_used(27) doesn't feel to me like we are pretending that the value might change. (Maybe that's just because I'm not a native speaker; if so please tell me!) So the assume -> pretend change didn't change anything about what I said in my previous comment:

assume_used seems only fitting for the "eat the result of my calculation" use case, but not for the "hide this constant from the compiler" use case.

Just as counter-example, pretend_modified (or pretend_altered) would be on the other side: good for the "hide a constant" case, not so much for the "sink for computation result" case.

I was thinking about a better word since yesterday, but didn't come up with anything useful. I guess pretend_used_and_modified would capture what I'm after, but that's obviously long and meh. Some other ideas to throw out there (though, none of them ideal IMO):

pretend_opaque
remove_assumptions, remove_knowledge
assume_nothing

jonhoo · 2020-07-30T21:24:07Z

Hmm, yeah, I see what you mean. For pretend_used to make sense, the reader does have to read "used" as "any impl", not just "reads". That said, I feel like the documentation could clarify this? The name of the function doesn't need to be perfect, it just has to read mostly intuitively, and crucially shouldn't be thought to give more guarantees that it does (which is the problem with black_box. Giving additional details about the kinds of use that "can be pretended" in the documentation seem (to be) like a fair compromise. But I may also be alone in feeling that way :)

Another take would be to go with something like std::hint::arbitrary_impl?

Mark-Simulacrum · 2020-07-30T21:48:44Z

Arbitrary impl could also be a no-op. In some sense, maybe "assume nothing" is what we want to get at -- i.e., the compiler should not presume that it knows anything about the function call being made and what it's effects are, beyond the normal "Rust guarantees" (i.e., mutable aliasing etc).

But assume_nothing looks quite odd...

jonhoo · 2020-07-30T21:53:20Z

@Mark-Simulacrum Another challenge with assume_nothing is that it implies stronger guarantees that the function actually gives. The problem is pretty similar to the one for black_box — people might use the hint for correctness purposes, thinking that they are telling the compiler to assume nothing about the passed-in value.

There's also a challenge here around the "object" of the verb that the function represents. assume_nothing(3 + x) (and some other names that have come up) imply (to me) that the compiler should assume nothing about 3 + x, rather than about what the assume_nothing function itself does with the value it is passed. The former reading might make the reader think that 3 + x won't be computed at compile time for example, when this function does not actually guarantee that.

CryZe · 2020-07-30T21:58:31Z

Maybe we want may_ / try_ / etc. as a prefix to indicate that it's not actually guaranteed.

jonhoo · 2020-07-30T22:05:43Z

Yeah, that might be a good idea. I prefer try_, but it's so strongly associated with Result return types that I feel like it'd probably be distracting. may_ as a prefix makes it seem more like part of the directive to the compiler of "is allowed to", as opposed to "try to pretend". Maybe suggest_?

jonhoo · 2020-07-30T22:06:15Z

There's also the argument that because this is in the hint module, it is never more than a hint, and should never be relied on for correctness? But that may be too subtle.

jonhoo · 2020-07-30T22:08:20Z

Here's a super weird suggestion: std::hint::arbitrary_safe_code.

petertodd · 2020-07-30T23:33:57Z

Here's a super weird suggestion: `std::hint::arbitrary_safe_code`.

A name like `mock_fn()` might help make it clear this is just meant for testing, not correctness.

jonhoo · 2020-07-31T00:30:42Z

True, though mock_fn also doesn't carry any semantics about what the hint actually is, which I think is vital.

jonhoo · 2020-08-03T19:00:34Z

Another way might be pretend_arbitrarily_used?

petertodd · 2020-08-03T19:25:36Z

Another way might be `pretend_arbitrarily_used`?

I gotta admit, seems kinda long. I'm personally OK if the exact details are 100% fully explained in the docs, which I'd expect anyone seeing "pretend_used" for the first time to read at least once. This is for test code remember, where the ratio of time spent writing the code vs time spent reading it will be higher than the usual case where more time is spent reading than writing.

crlf0710 · 2020-08-05T12:22:52Z

I'm thinking about something like:

fn pretend_modify<T>(v: &mut T);

or even just

fn touch<T>(v: &mut T);

but this is a little too jargon-y, i think...

jonhoo · 2020-08-05T12:29:29Z

fn pretend_modify<T>(v: &mut T);

This probably isn't what we want, since pretend_modify is then limited to doing things that are legal with a &mut T, rather than with a T as currently written.

As for the names, pretend_modify is a little weird to me, since it seems to only pretend that the value is modified. Whereas in reality we want to communicate that the method can use T in any way that is legal in safe Rust. touch I agree is probably too jargon-y, and even if it weren't, it's a little odd since touch doesn't really imply "may read this memory" for example.

tesuji · 2020-08-05T12:39:50Z

How about noopt, keep_unused ?

jonhoo · 2020-08-05T14:37:19Z

I'm not sure what noopt means? If it's short for no_optimization or dont_optimize, those don't work because the compiler still performs optimizations, we just suggest to it that it doesn't perform certain optimizations.

Could you say more about what you think keep_unused conveys?

LukasKalbertodt · 2020-08-05T20:46:11Z

Just throwing a few more things out there:

conceal,
pretend_concealed
conceal_value
conceal_from_compiler
Or any of the above with veil instead of conceal

I think I actually somewhat like conceal and conceal_value. Seem fitting for both situations:

// Hide constant
number_of_primes_below(hint::conceal_value(1000));

// Hide result of computation
vec.sort();
hint::conceal_value(vec);

petertodd · 2020-08-06T00:58:25Z

I think anything named "conceal", "hide" or similar will be confused with cryptography.

…

On August 5, 2020 4:46:26 PM EDT, Lukas Kalbertodt ***@***.***> wrote: Just throwing a few more things out there: - `conceal`, - `pretend_concealed` - `conceal_value` - `conceal_from_compiler` - Or any of the above with `veil` instead of `conceal` I think I actually somewhat like `conceal` and `conceal_value`. Seem fitting for both situations: ```rust // Hide constant number_of_primes_below(hint::conceal_value(1000)); // Hide result of computation vec.sort(); hint::conceal_value(vec); ``` -- You are receiving this because you commented. Reply to this email directly or view it on GitHub: #74853 (comment)

jonhoo · 2020-08-08T14:17:48Z

Here's a different proposal: consume_and_produce.

The idea is that it conveys to the compiler that it will consume the value that is passed to it, in some unspecified way. The produce part is there to make it clear to readers that it will also produce the same value.

Dylan-DPC-zz · 2020-08-25T12:40:18Z

Given there's no consensus here on the naming, I'm going to close this as this stage. I think we could discuss this in an issue or a MCP (once t-libs officially starts doing MCPs) and then rework this. Thanks :)

rust-highfive assigned shepmaster Jul 28, 2020

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jul 28, 2020

jonhoo mentioned this pull request Jul 28, 2020

Tracking issue for std::hint::black_box #64102

Closed

5 tasks

jonhoo force-pushed the assume-used branch from 382da8f to 24d671c Compare July 28, 2020 12:44

jonhoo changed the title ~~New name for hint::black_box: assume_used~~ New name for hint::black_box: pretend_used Jul 29, 2020

jonhoo force-pushed the assume-used branch from 24d671c to 10e0481 Compare July 29, 2020 13:03

Tweak docs a little

940e7be

CryZe reviewed Jul 29, 2020

View reviewed changes

Dylan-DPC-zz closed this Aug 25, 2020

Dylan-DPC-zz added S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Aug 25, 2020

New name for hint::black_box: pretend_used #74853

New name for hint::black_box: pretend_used #74853

Conversation

jonhoo commented Jul 28, 2020 • edited Loading

rust-highfive commented Jul 28, 2020

jonhoo commented Jul 28, 2020

bors commented Jul 28, 2020

bjorn3 commented Jul 28, 2020

LukasKalbertodt commented Jul 28, 2020

petertodd commented Jul 28, 2020

jonhoo commented Jul 28, 2020

jonhoo commented Jul 28, 2020

petertodd commented Jul 29, 2020

jonhoo commented Jul 29, 2020

jonhoo commented Jul 29, 2020

petertodd commented Jul 29, 2020

jonhoo commented Jul 29, 2020

CryZe commented Jul 29, 2020 • edited Loading

jonhoo commented Jul 29, 2020

bjorn3 commented Jul 29, 2020 • edited Loading

CryZe Jul 29, 2020

Choose a reason for hiding this comment

jonhoo Jul 29, 2020

Choose a reason for hiding this comment

CryZe commented Jul 29, 2020 • edited Loading

jonhoo commented Jul 29, 2020

LukasKalbertodt commented Jul 30, 2020

jonhoo commented Jul 30, 2020

Mark-Simulacrum commented Jul 30, 2020

jonhoo commented Jul 30, 2020

CryZe commented Jul 30, 2020

jonhoo commented Jul 30, 2020

jonhoo commented Jul 30, 2020

jonhoo commented Jul 30, 2020

petertodd commented Jul 30, 2020 via email

jonhoo commented Jul 31, 2020

jonhoo commented Aug 3, 2020

petertodd commented Aug 3, 2020 via email

crlf0710 commented Aug 5, 2020

jonhoo commented Aug 5, 2020

tesuji commented Aug 5, 2020

jonhoo commented Aug 5, 2020

LukasKalbertodt commented Aug 5, 2020

petertodd commented Aug 6, 2020 via email

jonhoo commented Aug 8, 2020

Dylan-DPC-zz commented Aug 25, 2020

jonhoo commented Jul 28, 2020 •

edited

Loading

CryZe commented Jul 29, 2020 •

edited

Loading

bjorn3 commented Jul 29, 2020 •

edited

Loading

CryZe commented Jul 29, 2020 •

edited

Loading